3574 results found.
Speech
Corpus,
Language Type:
Monolingual
Languages:
Arabic Bengali Dari Egyptian Arabic English Georgian Hindi Iranian Persian Italian Japanese Khmer Korean Lao Mandarin Chinese Min Nan Chinese Moroccan Arabic Panjabi Persian Russian Spanish Tagalog Thai Tigrinya Urdu
Availability:
From Owner
License:
LDC
Size:
950 hours
Production Status:
Existing-updated
Use:
Language Identification
-
Paper title:Modeling and training strategies for language recognition systems
-
Paper track:4.1 Language identification and verification, lang/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2008 NIST Speaker Recognition Evaluation Training Set Part 2 | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Not Available
License:
Size:
240000
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
-
Paper title:Adjunct-Emeritus Distillation for Semi-Supervised Language Model Adaptation
-
Paper track:14.16 Privacy-preserving Machine Learning for Audi/Poster Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Scott Novotney | Proprietary Interactive Voice Response Data | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Bilingual
Languages:
English German
Availability:
Freely Available
License:
see readme
Size:
15 GByte
Production Status:
Newly created-finished
Use:
Machine Learning
-
Paper title:NISQA - A Deep CNN-Self-Attention Model for Multidimensional Speech Quality Prediction with Crowdsourced Datasets
-
Paper track:5.11 Speech and audio quality assessment/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Gabriel Mittag | NISQA Speech Quality Corpus | /N |
Documentation:
https://github.com/gabrielmittag/NISQA/wiki/NISQA-Corpus
Video and Language
Vision-Language Dataset,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
MIT
Size:
510 videos Other
Production Status:
Newly created-finished
Use:
Machine Learning
-
Paper title:A hierarchical approach to vision-based language generation: from simple sentences to complex natural language
-
Paper track:Long paper/
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Simion-Vlad Bogolin | Videos-to-Paragraphs Dataset | /N |
Documentation:
No
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
1.7 billion comments Other
Production Status:
Existing-used
Use:
Language Modelling
-
Paper title:Exploring the Value of Personalized Word Embeddings
-
Paper track:Short paper/
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Charles Welch | Reddit Comments Corpus | /N |
Documentation:
None
Written
Ontology,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
MIT
Size:
39,617 synsets
Production Status:
Existing-updated
Use:
Textual Entailment and Paraphrasing
-
Paper title:Leveraging WordNet Paths for Neural Hypernym Prediction
-
Paper track:Long paper/
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yejin Cho | WN18RR-hp | /N |
Documentation:
None
Written
Lexicon,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-used
Use:
Animacy detection
-
Paper title:Living Machines: A study of atypical animacy
-
Paper track:Long paper/
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mariona Coll Ardanuy | Wordnet | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-updated
Use:
Corpus Creation/Annotation
-
Paper title:Living Machines: A study of atypical animacy
-
Paper track:Long paper/
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mariona Coll Ardanuy | Digitised Books. c. 1510 - c. 1900 | /N |
Documentation:
None
Written
Grammar/Language Model,
Language Type:
Monolingual
Languages:
English
Availability:
Will be freely available upon publication
License:
Size:
None
Production Status:
Newly created-finished
Use:
Language Modelling
-
Paper title:Living Machines: A study of atypical animacy
-
Paper track:Long paper/
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mariona Coll Ardanuy | 19thC English BERT models | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Will be freely available upon publication
License:
Size:
500 KByte
Production Status:
Newly created-finished
Use:
Animacy detection
-
Paper title:Living Machines: A study of atypical animacy
-
Paper track:Long paper/
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mariona Coll Ardanuy | 19thC Machines animacy dataset | /N |
Documentation:
None




